             Truth
Prediction    Happy hits  Happy pop  Happy tunes
  Happy hits           6         11            5
  Happy pop            8          4            5
  Happy tunes          6          5           10
# A tibble: 3 x 3
  .metric  .estimator .estimate
  <chr>    <chr>          <dbl>
1 accuracy multiclass     0.333
2 kap      multiclass     0
3 j_index  macro          0
Findings
When looking at the k-nearest neighbour classifier, we can see that the ‘happy’ playlists are hard to predict. Next to the confusion matrix we see that Cohen’s kappa and Youden’s J are both 0, which means that the classifier does no better than chance. The accuracy is about 0.33, which is what you would expect from guessing between three playlists. When we look at all the other information on this page (e.g. the ‘mosaic’) we can see that the computer is having a hard time identifying the ‘Happy hits’ and ‘Happy pop’ playlists. Let’s take a look at whether and how we can make the computer perform better.
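To make the link between the confusion matrix and the three metrics concrete, here is a minimal sketch in plain Python that recomputes accuracy, Cohen’s kappa and a macro-averaged Youden’s J from the counts in the table. The function names are my own, and the macro averaging (the mean of the per-class J values) is an assumption about how the reported `j_index` was obtained; it is not the code that produced the tibble.

```python
# Confusion matrix from the table: rows are predictions, columns are truth.
# Class order: Happy hits, Happy pop, Happy tunes.
confusion = [
    [6, 11, 5],
    [8, 4, 5],
    [6, 5, 10],
]


def accuracy(cm):
    """Share of songs on the diagonal (predicted playlist equals true playlist)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohens_kappa(cm):
    """Agreement corrected for the agreement expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / total**2
    return (accuracy(cm) - expected) / (1 - expected)


def macro_youden_j(cm):
    """Mean over classes of sensitivity + specificity - 1 (assumed macro average)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    j_values = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[k]) - tp                       # predicted k, truly another class
        fn = sum(cm[i][k] for i in range(n)) - tp  # truly k, predicted another class
        tn = total - tp - fp - fn
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j_values.append(sensitivity + specificity - 1)
    return sum(j_values) / n


print(round(accuracy(confusion), 3))
print(round(cohens_kappa(confusion), 3))
print(round(macro_youden_j(confusion), 3))
```

Because every playlist has 20 songs in the test set, the accuracy expected by chance is exactly one third, which is also the observed accuracy. That is why kappa comes out at 0.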
# A tibble: 3 x 3
  .metric  .estimator .estimate
  <chr>    <chr>          <dbl>
1 accuracy multiclass     0.383
2 kap      multiclass     0.075
3 j_index  macro          0.075
Findings
I used a random forest to see which features are most important in determining which playlist you hear. These features are duration, instrumentalness and the timbre components c02, c05, c09 and c11. Based on these features I tried to make the computer perform better at determining which song belongs to which playlist. As you can see in the confusion matrix, the difference is mostly not that big, and the computer is still having a hard time predicting, as you can see in the ‘mosaic’ and the ‘heatmap’.
I also tried to make a plot that consists of timbre component c11 (on the x-axis), timbre component c09 and the duration of the songs. It also shows which song belongs to which playlist. When looking at this plot there isn’t a really clear pattern to see. So I think we can conclude that there isn’t a clear correlation between these features. I don’t think it is necessary to change my previous plots, because the improvement using these features is not that big and k-nearest neighbour already showed that the predictions are chance-based.
I also tried this procedure on the other playlists, but that gave me an even smaller difference. Namely (k-nearest neighbour):
# A tibble: 3 x 3
  .metric  .estimator .estimate
  <chr>    <chr>          <dbl>
1 accuracy binary         0.375
2 kap      binary        -0.25
3 j_index  binary        -0.25
And (feature selection using the same features as for the ‘happy’ playlists):
# A tibble: 3 x 3
  .metric  .estimator .estimate
  <chr>    <chr>          <dbl>
1 accuracy binary         0.45
2 kap      binary        -0.100
3 j_index  binary        -0.100
Introduction
What I would like to examine is: what does it mean to be a ‘happy’ playlist? Or: what makes a playlist ‘happy’ if you compare it to playlists with negative emotions in the title?
What I am going to do is compare playlists with ‘happy’ in the title to playlists that have opposite emotions in the title (e.g. sad). My corpus consists of three playlists with ‘happy’ in the title. Two of these playlists consist of 100 songs and the other one of 80. The other part of the corpus consists of two playlists (‘Life sucks’ and ‘Sad songs’), which contain 100 and 60 songs. So the corpus represents playlists that are meant to be ‘happy’ and playlists that are not meant to be ‘happy’. The label ‘happy’ is chosen by Spotify, so it represents that label well. The other playlists are chosen by me and I feel like these playlists are the opposite of ‘happy’, but since they are not chosen by Spotify it might be that these labels overlap or are not related at all.
First findings
There seems to be a significant difference in energy between the ‘happy’ playlists and the other playlists. The happy playlists are more energetic than the other ones (M = .66, SD = .14). The energy level of the other playlists is rather low (M = .31, SD = .14). This seems to be a promising feature for identifying differences between ‘happy’ playlists and playlists that are not. Other promising features are:
There’s a significant difference in valence between the ‘happy’ playlists and the others. The ‘happy’ playlists have a much higher valence (M = .54, SD = .18). The valence for ‘Sad songs’ and ‘Life sucks’ is very low (M = .28, SD = .13).
The mean danceability of the ‘happy’ playlists is at least 0.64, and the other playlists are lower (both 0.51). So ‘happy’ playlists have a higher danceability, but the difference is small.
What is remarkable is that there doesn’t really seem to be a difference in the mode. They are all mostly major. I expected there to be a difference, with the happy songs (M = .66, SD = .48) in major and the other playlists (M = .78, SD = .42) in minor, but that isn’t the case.
There aren’t any extremes or outliers in my corpus, so I do not have to think about including or excluding any of them.
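The means and standard deviations above can be turned into a rough effect size, which makes it easier to compare how strongly each feature separates the two kinds of playlists. The sketch below computes Cohen’s d with a pooled standard deviation. It assumes that each M and SD was computed over all songs in a group (100 + 100 + 80 ‘happy’ songs and 100 + 60 other songs); that is my reading of the corpus description, not something the summary statistics state.

```python
import math


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


# Group sizes taken from the corpus description (assumption: no songs were dropped).
N_HAPPY = 100 + 100 + 80
N_OTHER = 100 + 60

# (M, SD) for the 'happy' playlists and for the other playlists, as reported above.
energy_d = cohens_d(0.66, 0.14, N_HAPPY, 0.31, 0.14, N_OTHER)
valence_d = cohens_d(0.54, 0.18, N_HAPPY, 0.28, 0.13, N_OTHER)
mode_d = cohens_d(0.66, 0.48, N_HAPPY, 0.78, 0.42, N_OTHER)

for name, d in [("energy", energy_d), ("valence", valence_d), ("mode", mode_d)]:
    print(f"{name}: d = {d:.2f}")
```

Under these assumptions energy and valence give large positive values of d, while mode gives a small negative one, which fits the observation that mode does not separate the playlists in the expected direction.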
Findings
This plot is about the valence, energy, liveness and mode of the two kinds of playlists. It is based on the plot of Dr John Ashley Burgoyne that was shown in class. I wanted to use different colors from the same package, but I don’t know yet how to do that. The plot shows that the valence and energy for most songs in the happy playlists are high, while for the other kind of playlists the valence and energy are low. This is the same pattern as I already saw last week. Another pattern that I saw last week is that there isn’t really a difference in the mode of both kinds of playlists. What can also be seen in the graph is that ‘happy’ songs tend to have a higher liveness than songs in the other playlists. When I checked this audio feature last week I didn’t see that there is such a difference between the playlists (M = .18, SD = .14 for ‘happy’ and M = .13, SD = .06 for the others). This is something I didn’t expect to find, so I would like to investigate this further in the coming weeks.
Last week I thought there weren’t any outliers, but there seem to be some. I don’t think they will affect the results, because there are many songs that aren’t outliers. So I won’t take the outliers out.
Findings
In this part I added three graphs that are about the playlists. The first graphs you can see are about the difference between ‘happy playlists’ and the other kinds of playlists. There you can see that most songs are in c or c sharp. That is to be expected since c is a ‘basic’ key in music theory. What is remarkable is that the distribution in the middle section of the ‘happy playlists’ barchart is very high. There aren’t really any big differences in mode, whereas if you look at the barchart of the opposite playlists you see that not everything is (almost) the same. There is more variety in there, but this also doesn’t show a very big difference.
When we take a closer look at the different ‘happy’ playlists we see that every playlist has its own distribution, but they do show some similarities. For example, Happy pop and Happy tunes don’t have a lot of songs in d sharp, but do have a lot of songs in b and f sharp. When we compare Happy pop to Happy hits their shape is almost the same, but Happy hits contains a lot more songs in g than Happy pop does. That they look so much alike might be because they contain the same (kind of) songs, but that is something I want to look at in the coming weeks. Something that can also be seen in all of these graphs is that all the playlists contain a lot of songs in c. That is the same as we saw earlier when comparing all the Happy playlists to all the other playlists.
Now take a look at the third set of graphs, which compare ‘Life sucks’ to ‘Sad songs’. Again most songs are in c, but what we can see now as well is that ‘Sad songs’ contains a lot of songs in every key. Only c stands out, but not as much as in the other graphs. From this graph we can conclude that sad songs are written in any key, but we must look at more playlists that are ‘sad’ to conclude it for the label of ‘sad’ that Spotify uses. When we look at ‘Life sucks’ we see a graph that is comparable to the happy playlists. C stands out again and there also aren’t many songs in d sharp, just as in the ‘Happy pop’ and ‘Happy tunes’ playlists. When we look at the middle section, the number of songs per key is almost the same, but higher than for most keys in the last section. The last section shows the same pattern, but with fewer songs per key.
Overall, we can conclude that all playlists are mostly in c and that d sharp is not used often in these playlists. We can also conclude that there doesn’t really seem to be a difference in mode per kind of playlist. So what makes a ‘Happy playlist’ happy doesn’t lie in the key that is used.
Findings
This plot shows the timbre of ‘Sweet Caroline’, which is an outlier in the plot of week 7. As we can see it has almost an aba’ba’’b structure. This is what you might expect when looking at pop music, since a lot of pop songs are structured like this. So this doesn’t really explain what makes this song so different from other songs.
Findings
When we look at this chromagram, we kind of see the same as in the previous plot. We can still see the ababab-structure, but the third a seems to be very different when comparing it to the other a-sections. In this plot it’s easier to see that this part is not only shorter than the other a-sections, but that it also looks a lot like the b-sections. So this a-section might be why Sweet Caroline is an outlier.
Findings
When looking at this chromagram we see that this song is mostly in e minor. This is a bit strange if we look into the plot of week 7 where it says that the song is in major. You might expect ‘happy songs’ to be in major, but this song proves that there are also ‘happy songs’ in minor (if there is no mistake made in this chromagram).
Findings
This cepstrogram is about ‘Kiss somebody’, which is a ‘typical happy song’ according to the plot of week 7. It doesn’t really show a clear structure when we look at c02, and when we look at c03 it looks like the song is through-composed. There are mostly purple and yellow beats that look almost the same, but they are not similar enough to say there’s a clear structure. When we compare this cepstrogram to the cepstrogram of Sweet Caroline we see that there’s definitely a difference in structure. This is something you probably wouldn’t expect, since we’re still working with pop music.
Findings
When we take a look at this chromagram we don’t see a really clear (musical) pattern. I feel like the chromagram should be moved up by a semitone. This will show d, e, g and c as the most used pitches, which looks a lot like a c major chord. This corresponds to the plots of week 8, where we already found that most (happy) songs are in c. So this is what you might expect.
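Moving a chromagram up by a semitone comes down to rotating the twelve pitch classes by one position. The sketch below shows that rotation in plain Python. The input pitch classes (c sharp, d sharp, f sharp and b) are simply the ones that lie a semitone below d, e, g and c; they are inferred from the paragraph above and not read off the actual chromagram, and the function names are my own.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def transpose(names, semitones):
    """Move each pitch-class name up by the given number of semitones."""
    return [
        PITCH_CLASSES[(PITCH_CLASSES.index(name) + semitones) % 12]
        for name in names
    ]


def shift_chroma(chroma, semitones):
    """Rotate a 12-element chroma vector so the energy at index i moves to i + semitones."""
    return [chroma[(i - semitones) % 12] for i in range(12)]


# Pitch classes one semitone below d, e, g and c (inferred, see the text above).
print(transpose(["C#", "D#", "F#", "B"], 1))

# A chroma vector with all its energy on C# ends up with all its energy on D.
only_c_sharp = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(shift_chroma(only_c_sharp, 1))
```

The modulo 12 takes care of the wrap-around, so b moves up to c.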
Findings
At first this novelty function might look a bit confusing, but when you know the structure of Sweet Caroline, you can see it in this novelty function as well. The structure of this song is: intro (i), verse (v), pre-chorus (pc), chorus (c), v, pc, c, instrumental part and outro. Knowing that the first verse starts around 13 seconds and the second verse starts around 82 seconds, you can see a clear pattern that repeats itself. Around 82 seconds in the novelty function it looks like the song is cut in two pieces, which is also the case if you look at the structure of the song. Around 157 seconds you can see some ‘strange’ peaks. This is actually the part where only the instruments play. Around 170 seconds Neil Diamond starts singing again and the instruments become less loud until the song eventually fades out (around 170 seconds). This is also something you can see in the novelty function. Now let’s take a look at Kiss somebody.
Findings
This novelty function shows fewer peaks than the novelty function of Sweet Caroline. This might be because Sweet Caroline contains many more ‘live’ instruments, while Kiss somebody is made with electronic instruments, which probably don’t produce such big differences in loudness. That is also the reason why the structure of this song is hard to see in this novelty function, even when you know the structure of the song and the time stamps that go along with the different parts. At the end of the song it becomes harder to identify each part, because the parts are more compressed than earlier in the song. Compared to Sweet Caroline, you might want to conclude that typical ‘happy’ hits are harder to understand when looking at a novelty function, due to their lack of big changes in loudness, which might be caused by electronic instruments.
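To illustrate why a song with small changes in loudness gives few peaks, here is a minimal sketch of a loudness-based novelty function: the increase in loudness from one segment to the next, with decreases set to zero. This is a generic definition that I assume is close to what the plots show; it is not the code behind the plots. The two loudness sequences are made-up values in dB, not measurements from either song.

```python
def novelty(loudness):
    """Half-wave rectified difference: only increases in loudness count as novelty."""
    if not loudness:
        return []
    result = [0.0]  # the first segment has no predecessor to compare with
    for previous, current in zip(loudness, loudness[1:]):
        result.append(max(0.0, current - previous))
    return result


def count_peaks(values, threshold):
    """Number of novelty values that rise above the threshold."""
    return sum(1 for value in values if value > threshold)


# Hypothetical loudness per segment in dB (illustrative values only).
dynamic_song = [-20, -12, -18, -8, -15, -6]       # large swings in loudness
compressed_song = [-9, -8, -9, -8, -8.5, -8]      # loudness stays almost flat

print(count_peaks(novelty(dynamic_song), threshold=3))
print(count_peaks(novelty(compressed_song), threshold=3))
```

With the same threshold, the sequence with large swings produces several peaks and the flat sequence produces none, which is the effect described for the two songs.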
Findings
When we compare ‘happy’ playlists to other kinds of playlists, we can’t see big differences in the tempo (BPM) that is used per song. At first glance it looks like there is a huge difference in the distribution, but when you take a closer look, you can see that the ‘happy’ playlists dataset contains more songs and is like an ‘extreme’ version of the other kinds of playlists. The peaks around 75 and 125 are higher in the histogram of the ‘happy’ playlists, but the peaks are around the same tempo as in the histogram of the other kinds of playlists. So we can conclude that there doesn’t seem to be a real difference in the tempo for these playlists.
When we take a closer look at the different ‘happy’ playlists, we see the same kind of patterns with peaks around 75 and 125. When looking at the histograms of the other kinds of playlists, we see that ‘Life sucks’ shows almost the same pattern as the first histogram that contains all ‘happy’ playlists. The only differences here are that the peak around 75 is higher than the one around 125 as is the case in the other histogram and that ‘Life sucks’ contains fewer songs (obviously). There are more differences to see in the histogram of ‘Sad songs’. The pattern is a bit different, but the overall shape is almost the same as the other histograms. Where the others have a peak around 75 that is higher than the peak containing 150, this histogram shows the exact opposite, but the difference isn’t that big, so it doesn’t affect the ‘overall’ histogram of the playlists. In conclusion, the tempo of a happy playlist isn’t that different when comparing it to other kinds of playlists.
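Since the groups contain different numbers of songs, one way to compare the shapes of the histograms directly is to plot the proportion of songs per tempo bin instead of the count. The sketch below shows that normalisation in plain Python. The bin edges and the two tempo lists are hypothetical and only chosen to show the idea; they are not the tempos from my corpus.

```python
def tempo_proportions(tempos, low=50, high=200, bin_width=25):
    """Share of songs per tempo bin, so groups of different size can be compared."""
    n_bins = (high - low) // bin_width
    counts = [0] * n_bins
    for tempo in tempos:
        if low <= tempo < high:
            counts[int((tempo - low) // bin_width)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [count / total for count in counts]


# Hypothetical tempos (BPM): the large group has ten times as many songs as the
# small group, but the same mix of slower and faster songs.
large_group = [75] * 40 + [125] * 60
small_group = [75] * 4 + [125] * 6

print(tempo_proportions(large_group))
print(tempo_proportions(small_group))
```

In this made-up example the counts differ by a factor of ten, but the proportions per bin are identical, so the two histograms would have the same shape after normalising.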
(I don’t know what the correct way of referring to the bars of the histogram is, so maybe you could help me with that.)
Learn how to change the colors in the chromagrams of week 9.
Incorporate ggplotly in my graphs from week 8 (and make different tabs to show all of the graphs).
Investigate the ‘liveness’ of happy songs according to the findings of week 7.
Look at the different (kind of) songs that are used in the ‘Happy pop’ and ‘Happy hits’ playlists to see if they overlap in some kind of way.
Make my dashboard more user-friendly (e.g. adjust the chart sizes and make several graphs on a tab more accessible).
Change the tab titles so that they tell a story
Add findings for week 10